NECE is an event-based text analysis toolkit for narrative documents. The purpose of NECE is to give users open and easy access to event-based summaries and abstractions of long narrative documents, through a graphical interface and a Python package that can readily be used for narrative analysis, comprehension, or other advanced purposes. Our work addresses the challenges of extracting events from long documents and of ordering key events chronologically; it also offers options to select and view events related to narrative entities, such as main characters and gender groups. We conduct human evaluations to demonstrate the quality of the event-chain extraction system and the character-mining algorithm. Finally, we illustrate the toolkit's potential downstream applications by demonstrating its use in gender bias analysis and question answering tasks.
Text classification can be useful in many real-world scenarios, saving end users a lot of time. However, building a custom classifier typically requires coding skills and ML knowledge, which poses a significant barrier to many potential users. To lift this barrier, we introduce Label Sleuth, a free open-source system for labeling text and creating text classifiers. The system is unique in (a) being a no-code system, (b) guiding users from a cold start to a custom classifier within a few hours, and (c) being open for configuration and extension by developers. By open-sourcing Label Sleuth, we hope to build a community of users and developers that will broaden the utilization of NLP models.
As "a new frontier in evolutionary computation research", evolutionary transfer optimization (ETO) overcomes the traditional paradigm of zero reuse of related experience and knowledge from previously solved problems in evolutionary computation. In scheduling applications of ETO, a highly attractive and competitive framework can be formed for intelligent scheduling and green scheduling, especially in view of China's pledge of "carbon neutrality". To the best of our knowledge, this paper is the first work on the class of ETO frameworks for scheduling in which multiobjective optimization problems "meet" single-objective optimization problems in discrete cases (rather than multitasking optimization). More specifically, key knowledge conveyed for industrial applications, such as positional building blocks clustered with genetic algorithms for the permutation flow shop scheduling problem (PFSP), can be used via a new core transfer mechanism and learning techniques. Extensive studies on well-studied benchmarks validate the firm effectiveness and great universality of the proposed ETO-PFSP framework. Our investigations (1) enrich the ETO frameworks, (2) contribute to the classical and fundamental theory of building blocks for genetic algorithms and memetic algorithms, and (3) work toward a paradigm shift of evolutionary scheduling via learning, namely the "knowledge and building-block based scheduling" (KAB2S) paradigm for China's "industrial intelligence".
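One way to read the building-block transfer idea above is to mine positional statistics from elite solutions of a solved source instance and use them to bias the initial population for a target instance. The sketch below is only our loose interpretation of the abstract, not the authors' ETO-PFSP mechanism; all names and the tiny toy data are hypothetical.

```python
# Loose illustration of transferring "positional building blocks" between
# permutation flow shop instances: mine job-position frequencies from elite
# solutions of a solved source problem, then use them to bias the initial
# population of a genetic algorithm on the target problem.
import random

def mine_blocks(elite_perms, n_jobs):
    """freq[j][p] = how often job j sits at position p among elite solutions."""
    freq = [[0] * n_jobs for _ in range(n_jobs)]
    for perm in elite_perms:
        for p, j in enumerate(perm):
            freq[j][p] += 1
    return freq

def seeded_perm(freq, n_jobs, rng):
    """Sample a permutation position-by-position, biased toward frequent blocks."""
    remaining = list(range(n_jobs))
    perm = []
    for p in range(n_jobs):
        weights = [freq[j][p] + 1 for j in remaining]  # +1 for smoothing
        j = rng.choices(remaining, weights=weights)[0]
        remaining.remove(j)
        perm.append(j)
    return perm

rng = random.Random(0)
elites = [[0, 1, 2, 3], [0, 1, 3, 2], [0, 2, 1, 3]]  # toy source-problem elites
freq = mine_blocks(elites, 4)
population = [seeded_perm(freq, 4, rng) for _ in range(5)]
```

The seeded population would then replace the usual uniform-random initialization of the target-problem GA, so that promising job placements learned on the source instance survive into the target search.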
Evaluation in machine learning is usually informed by past choices, for example which datasets or metrics to use. This standardization enables comparison on an equal footing via leaderboards, but evaluation choices become sub-optimal as better alternatives arise. This problem is especially pertinent in natural language generation, which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims. To make following best model evaluation practices easier, we introduce GEMv2. The new version of the Generation, Evaluation, and Metrics benchmark provides a modular infrastructure for dataset, model, and metric developers so that they can benefit from each other's work. GEMv2 supports 40 documented datasets in 51 languages. Models for all datasets can be evaluated online, and our interactive data card creation and rendering tools make it easier to add new datasets to the living benchmark.
More and more investors and machine learning models rely on social media (e.g., Twitter and Reddit) to gather real-time information and sentiment to predict stock price movements. Although text-based models are known to be vulnerable to adversarial attacks, whether stock prediction models have similar vulnerabilities is underexplored. In this paper, we experiment with a variety of adversarial attack configurations to fool three stock prediction victim models. We address the task of adversarial generation by solving a combinatorial optimization problem with semantics and budget constraints. Our results show that the proposed attack method can achieve consistent success rates and cause significant monetary loss in trading simulations by simply concatenating a perturbed but semantically similar tweet.
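To make the concatenation attack concrete, here is a minimal sketch of the idea under a word-swap budget. Everything in it is a hypothetical stand-in: the keyword-counting victim model, the lexicons, and the loose synonym table. A real attack as described above would enforce semantic similarity properly (e.g., with embeddings) and target a neural predictor.

```python
# Toy sketch of a concatenation-style adversarial attack on a stock-sentiment
# model: greedily perturb a copy of a tweet under a replacement budget so that
# appending it to the input stream lowers the victim's bullish score.

BULLISH = {"surge", "beat", "rally", "buy"}
BEARISH = {"slump", "miss", "selloff", "sell"}

def victim_score(tweets):
    """Crude keyword victim: positive score => predicts the price goes up."""
    s = 0
    for t in tweets:
        for w in t.lower().split():
            s += (w in BULLISH) - (w in BEARISH)
    return s

# Candidate swaps standing in for the semantic constraint; a real semantic
# filter would reject some of these, the toy table is deliberately loose.
SYNONYMS = {
    "surge": ["climb", "slump"],
    "beat": ["top", "miss"],
    "rally": ["rebound", "selloff"],
}

def perturb(tweet, target_tweets, budget=2):
    """Greedily apply up to `budget` word swaps that most reduce the victim
    score once the perturbed tweet is concatenated to the input stream."""
    words = tweet.split()
    for _ in range(budget):
        best = None
        base = victim_score(target_tweets + [" ".join(words)])
        for i, w in enumerate(words):
            for cand in SYNONYMS.get(w.lower(), []):
                trial = words[:i] + [cand] + words[i + 1:]
                s = victim_score(target_tweets + [" ".join(trial)])
                if s < base:
                    base, best = s, trial
        if best is None:
            break  # no swap helps anymore
        words = best
    return " ".join(words)

stream = ["shares surge after earnings beat"]
adv = perturb(stream[0], stream)
print(victim_score(stream), victim_score(stream + [adv]))  # score drops
```

The greedy loop is a cheap surrogate for the combinatorial optimization mentioned in the abstract; the budget caps the number of edits, mirroring the paper's budget constraint.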
Despite the long history of studying instant messaging use in organizations, we know little about how people today participate in group chat channels and interact with others. In this short note, we aim to update the existing knowledge on how group chat is used in the context of today's organizations. We had the privilege of collecting 4,300 public group chat channels in Slack from the R&D department of a multinational IT company. By qualitatively coding 100 of these channels, we identified nine channel categories, such as project channels and event channels. We further defined a set of 21 features to describe the group communication style of these channels, and successfully trained a machine learning model that automatically classifies a given channel into one of the nine categories. In addition, we illustrate how these communication metrics can be used to analyze a team's collaboration activities. We focused on 117 project teams for which we had performance data, collected the Slack group data of 54 of these 117 teams, and generated communication style metrics for each. With these data, we were able to build regression models that reveal the relationship between group communication styles and one indicator of project team performance.
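The featurize-then-classify pipeline described above can be sketched as follows. The three features and the nearest-centroid model are hypothetical stand-ins for the note's 21 features and whatever classifier the authors actually trained; the numbers are invented for illustration.

```python
# Minimal sketch of the channel-classification pipeline: describe each chat
# channel by a vector of communication-style metrics, then classify it into a
# category with a simple nearest-centroid model.
import math

# (msgs_per_day, n_participants, reply_ratio) -> hand-labeled category
TRAIN = [
    ((40.0, 8, 0.70), "project"),
    ((35.0, 6, 0.60), "project"),
    ((5.0, 120, 0.10), "event"),
    ((3.0, 200, 0.05), "event"),
]

def centroids(rows):
    """Mean feature vector per category."""
    sums, counts = {}, {}
    for x, y in rows:
        s = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: tuple(v / counts[y] for v in s) for y, s in sums.items()}

def classify(x, cents):
    """Assign the category whose centroid is nearest in Euclidean distance."""
    return min(cents, key=lambda y: math.dist(x, cents[y]))

cents = centroids(TRAIN)
print(classify((30.0, 7, 0.65), cents))  # a busy, small, reply-heavy channel
```

With real data, the same feature vectors double as the communication-style metrics fed into the regression models against team performance.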
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple while its performance is impressive. CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains strongly robust even if the LiDAR is missing. Code will be released at https://github.com/junjie18/CMT.
Knowledge graphs (KG) have served as a key component of various natural language processing applications. Commonsense knowledge graphs (CKG) are a special type of KG, where entities and relations are composed of free-form text. However, previous works on KG completion and CKG completion suffer from long-tail relations and newly-added relations that do not have many known triples for training. In light of this, few-shot KG completion (FKGC), which draws on the strengths of graph representation learning and few-shot learning, has been proposed to address the problem of limited annotated data. In this paper, we comprehensively survey previous attempts at such tasks in the form of a series of methods and applications. Specifically, we first introduce FKGC challenges, commonly used KGs, and CKGs. Then we systematically categorize and summarize existing works in terms of the type of KGs and the methods. Finally, we present applications of FKGC models on prediction tasks in different areas and share our thoughts on future research directions of FKGC.
Few-Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes with only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features within a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After the above steps, performance on the novel classes improves significantly over our strong baseline. Additionally, our new framework can easily be extended to incremental FSIS with minor modifications. When benchmarking on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shot counts, e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method in the 10/30-shot settings. We further demonstrate the superiority of our approach on Few-Shot Object Detection. Code and models will be available.
Graph Neural Networks (GNNs) have shown satisfying performance on various graph learning tasks. To achieve better fitting capability, most GNNs carry a large number of parameters, which makes them computationally expensive. It is therefore difficult to deploy them on edge devices with scarce computational resources, e.g., mobile phones and wearable smart devices. Knowledge Distillation (KD) is a common solution to compress GNNs, in which a lightweight model (the student) is encouraged to mimic the behavior of a computationally expensive GNN (the teacher). Nevertheless, most existing GNN-based KD methods lack fairness considerations. As a consequence, the student model usually inherits and even exaggerates the bias of the teacher GNN. To handle this problem, we take initial steps towards fair knowledge distillation for GNNs. Specifically, we first formulate a novel problem of fair knowledge distillation for GNN-based teacher-student frameworks. We then propose a principled framework named RELIANT to mitigate the bias exhibited by the student model. Notably, the design of RELIANT is decoupled from any specific teacher and student model structures, and can thus be easily adapted to various GNN-based KD frameworks. We perform extensive experiments on multiple real-world datasets, corroborating that RELIANT achieves less biased GNN knowledge distillation while maintaining high prediction utility.
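The tension the abstract describes, compressing a model while not inheriting its bias, can be made schematic as a three-term objective: task loss, distillation loss, and a fairness penalty. The sketch below is only an illustration of that general idea with hypothetical names; it is not RELIANT's actual formulation, and it uses a demographic-parity gap as a simple stand-in fairness term.

```python
# Schematic fairness-regularized distillation objective for a binary node
# classifier: task cross-entropy + KL-based distillation + a demographic-
# parity penalty on the student's positive-prediction rates per group.
import math

def kl(p, q, eps=1e-9):
    """KL divergence between two discrete distributions given as tuples."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def fair_kd_loss(student_probs, teacher_probs, labels, groups, lam=1.0, mu=1.0):
    n = len(labels)
    # Task loss: cross-entropy of student predictions vs. ground truth.
    task = -sum(math.log(sp[y] + 1e-9) for sp, y in zip(student_probs, labels)) / n
    # Distillation loss: average KL(teacher || student) over nodes.
    distill = sum(kl(tp, sp) for tp, sp in zip(teacher_probs, student_probs)) / n
    # Fairness penalty: gap between groups' mean positive-class probability.
    def rate(g):
        members = [sp[1] for sp, gr in zip(student_probs, groups) if gr == g]
        return sum(members) / max(1, len(members))
    parity_gap = abs(rate(0) - rate(1))
    return task + lam * distill + mu * parity_gap

# Tiny example: student matches the teacher exactly, so distill == 0 and the
# remaining loss is task cross-entropy plus the parity gap.
sp = [(0.9, 0.1), (0.1, 0.9)]
loss = fair_kd_loss(sp, sp, labels=[0, 1], groups=[0, 1])
print(round(loss, 4))
```

In an actual GNN pipeline, `student_probs` and `teacher_probs` would come from the two models' softmax outputs over node logits, and `mu` would trade prediction utility against bias, the balance the abstract says RELIANT maintains.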